Improved hidden Markov model partial tracking through time-frequency analysis
In this article we propose a modification to the combinatorial hidden Markov model developed in [1] for tracking partial frequency trajectories. We employ the Wigner-Ville distribution and the Hough transform to (re)estimate the frequency and chirp rate of partials in each analysis frame. We estimate the initial phase and amplitude of each partial by minimizing the squared error in the time domain. We then formulate a new scoring criterion for the hidden Markov model that makes the tracker more robust to non-stationary and noisy signals. We achieve good performance when tracking crossing linear chirps and crossing FM signals in white noise, as well as on real instrument recordings.
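The amplitude and initial-phase step can be illustrated with a small sketch. It assumes a single real-valued linear chirp whose frequency and chirp rate are already known (for example from a previous estimation stage), in which case minimizing the time-domain squared error reduces to a linear least-squares problem. The function name `fit_amp_phase` and the signal model are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_amp_phase(x, f0, chirp_rate, fs):
    """Least-squares amplitude and initial phase of one linear chirp.

    Assumed model (hypothetical, for illustration):
        x[n] ~ A * cos(phi0 + 2*pi*(f0*t + 0.5*chirp_rate*t**2)),  t = n / fs
    with f0 in Hz and chirp_rate in Hz/s, both taken as known.
    """
    t = np.arange(len(x)) / fs
    theta = 2 * np.pi * (f0 * t + 0.5 * chirp_rate * t**2)
    # A*cos(theta + phi0) = a*cos(theta) + b*sin(theta),
    # with a = A*cos(phi0) and b = -A*sin(phi0), so the fit is linear in (a, b).
    basis = np.column_stack([np.cos(theta), np.sin(theta)])
    (a, b), *_ = np.linalg.lstsq(basis, x, rcond=None)
    return np.hypot(a, b), np.arctan2(-b, a)
```

For several partials in one frame, the same idea would apply with two basis columns per partial, assuming the partials are sufficiently separated for the system to be well conditioned.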
Sparse Atomic Modeling of Audio: a Review
Research into sparse atomic models has recently intensified in the image and audio processing communities. While other reviews exist, we believe this paper provides a good starting point for the uninitiated reader, as it concisely summarizes the state of the art and presents most of the major topics in an accessible manner. We discuss several approaches to the sparse approximation problem, including various greedy algorithms, iteratively re-weighted least squares, iterative shrinkage, and Bayesian methods. We provide pseudo-code for several of the algorithms, and have released software that includes fast dictionaries and reference implementations for many of the algorithms. We discuss the relevance of the different approaches for audio applications, and include numerical comparisons. We also illustrate several audio applications of sparse atomic modeling.
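As a concrete instance of the greedy family mentioned above, the following is a minimal sketch of basic matching pursuit. It is not taken from the paper's pseudo-code or released software; the function name and the assumption that the dictionary `D` has unit-norm columns are illustrative.

```python
import numpy as np

def matching_pursuit(x, D, n_iter):
    """Basic matching pursuit (illustrative sketch).

    Assumes D is a (signal_length, n_atoms) array with unit-norm columns.
    Returns the coefficient vector and the final residual.
    """
    residual = np.array(x, dtype=float)
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_iter):
        corr = D.T @ residual            # correlation of every atom with the residual
        k = int(np.argmax(np.abs(corr))) # greedy choice: most correlated atom
        coeffs[k] += corr[k]
        residual -= corr[k] * D[:, k]    # remove that atom's contribution
    return coeffs, residual
```

With an orthonormal dictionary this recovers a K-sparse signal in exactly K iterations; with redundant dictionaries the residual only decays gradually, which is one motivation for the other approaches the review covers.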
Analysis/Synthesis Using Time-Varying Windows and Chirped Atoms
A common assumption about audio signals is that they are short-term stationary. In other words, it is typically assumed that the statistical properties of audio signals change slowly enough that they can be considered nearly constant over a short interval. However, with a fixed analysis window, as is typical in practice, we have no way to change the analysis parameters over time in order to track the slowly evolving properties of the audio signal. For example, while a long window may be appropriate for analyzing tonal phenomena, it will smear subsequent note onsets. Furthermore, the audio signal may not be completely stationary over the duration of the analysis window. This is often true of sounds containing glissando, vibrato, and other transient phenomena. In this paper we build upon previous work targeted at non-stationary analysis/synthesis. In particular, we discuss how to simultaneously adapt the window length and the chirp rate of the analysis frame in order to maximally concentrate the spectral energy. This is done by (a) finding the analysis window that leads to the minimum-entropy spectrum and (b) estimating the chirp rate using the distribution derivative method. We also discuss a fast method of analysis/synthesis using the fan-chirp transform and overlap-add. Finally, we analyze several real and synthetic signals and show a qualitative improvement in the spectral energy concentration.
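The minimum-entropy criterion in step (a) can be sketched as follows. This is only an illustration under assumed choices (Hann windows, Shannon entropy of the normalized power spectrum, a common zero-padded FFT size so that entropies are comparable); it does not include the chirp-rate estimation of step (b), and the function names are hypothetical.

```python
import numpy as np

def spectral_entropy(frame, nfft):
    """Shannon entropy of the normalized power spectrum of one frame."""
    power = np.abs(np.fft.rfft(frame, n=nfft)) ** 2
    p = power / power.sum()
    p = p[p > 0]                      # skip empty bins to avoid log(0)
    return float(-np.sum(p * np.log(p)))

def min_entropy_window(x, center, lengths, nfft=4096):
    """Pick the candidate window length whose spectrum is most concentrated.

    Assumes nfft >= max(lengths) and that every window fits inside x.
    """
    entropies = {}
    for length in lengths:
        start = center - length // 2
        frame = x[start:start + length] * np.hanning(length)
        entropies[length] = spectral_entropy(frame, nfft)
    return min(entropies, key=entropies.get), entropies
```

In this sketch a steady tone favors the longest candidate, while a rapidly sweeping chirp favors a shorter one, which matches the trade-off the abstract describes.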
Modal Analysis of Room Impulse Responses Using Subband ESPRIT
This paper describes a modification of the ESPRIT algorithm that can be used to determine the parameters (frequency, decay time, initial magnitude and initial phase) of a modal reverberator that best match a provided room impulse response. By applying perceptual criteria we are able to match room impulse responses using a variable number of modes, with an emphasis on high quality for lower mode counts; this allows the synthesis algorithm to scale to different computational environments. A hybrid FIR/modal reverb architecture is also presented, which allows for the efficient modeling of room impulse responses that contain sparse early reflections and dense late reverb. MUSHRA tests comparing the analysis/synthesis using various mode counts for our algorithms, and for another state-of-the-art algorithm, are included as well.
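For orientation, the following sketch shows plain, full-band ESPRIT applied to a sum of damped sinusoids. It is not the subband modification or the perceptual mode selection described in the paper; the function names, the Hankel matrix shape, and the choice to report decay rate (in 1/s) rather than decay time are assumptions made for illustration.

```python
import numpy as np

def esprit_poles(x, n_poles, rows=None):
    """Estimate signal poles z_k of x[n] ~ sum_k c_k * z_k**n (standard ESPRIT)."""
    n = len(x)
    rows = rows or n // 2
    # Hankel data matrix: hankel[i, j] = x[i + j]
    hankel = np.array([x[i:i + n - rows + 1] for i in range(rows)])
    u, _, _ = np.linalg.svd(hankel, full_matrices=False)
    us = u[:, :n_poles]                      # signal subspace
    # Shift invariance: us[1:] ~ us[:-1] @ phi, and eig(phi) are the poles
    phi = np.linalg.pinv(us[:-1]) @ us[1:]
    return np.linalg.eigvals(phi)

def modes_from_poles(poles, fs):
    """Convert poles to modal frequencies (Hz) and decay rates (1/s)."""
    freqs = np.angle(poles) * fs / (2 * np.pi)
    decay_rates = -np.log(np.abs(poles)) * fs
    return freqs, decay_rates
```

A real-valued mode contributes a complex-conjugate pole pair, so `n_poles` would be twice the number of real modes. Initial magnitudes and phases could then be obtained by a least-squares fit of the resulting damped exponentials to the impulse response.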